# Large-scale Visual Representation
## Vit So400m Patch16 Siglip 256.webli I18n
- License: Apache-2.0
- Description: A Vision Transformer image encoder based on SigLIP, focused on image feature extraction and using the original attention-pooling mechanism.
- Tags: Image Classification, Transformers
- Library: timm
- Downloads: 15 · Likes: 0
## Vit Large Patch14 Clip 224.datacompxl
- License: Apache-2.0
- Description: A Vision Transformer image encoder based on the CLIP architecture, designed for image feature extraction and released by the LAION organization.
- Tags: Image Classification, Transformers
- Library: timm
- Downloads: 14 · Likes: 0
## Convnext Base.clip Laion2b Augreg
- License: Apache-2.0
- Description: A ConvNeXt-Base image encoder trained under the CLIP framework on the LAION-2B dataset; supports image feature extraction.
- Tags: Image Classification, Transformers
- Library: timm
- Downloads: 522 · Likes: 0
## Convnext Base.clip Laion2b
- License: Apache-2.0
- Description: A CLIP image encoder based on the ConvNeXt architecture, trained by LAION and suitable for multimodal vision-language tasks.
- Tags: Image Classification, Transformers
- Library: timm
- Downloads: 297 · Likes: 0
## Resnet50x64 Clip.openai
- License: MIT
- Description: A CLIP model built on the ResNet-50x64 architecture from the OpenCLIP library, supporting zero-shot image classification.
- Tags: Image Classification
- Library: timm
- Downloads: 622 · Likes: 0